ABSTRACT

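The purpose of this study was to investigate whether the patterns of gender
differences on specific problem types found in previous studies were
consistent across countries. Subjects were seventh- and eighth-grade
students from Japan, Belgium, Canada (British Columbia and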
Ontario), France, the United States, New Zealand, and Thailand. 
Students were tested on the same mathematical items and problems. 
Results indicated no substantial gender effects or differences in any 
of the mathematics content areas, problem types, or national origins. 
Other studies suggested that curricula, pedagogies, or cultural 
factors may interact with gender differences in impacting 
quantitative performance. Tables and 21 references are included. 
(JriP) 




GENDER DIFFERENCES IN MATHEMATICS: AN INTERNATIONAL PERSPECTIVE



Corinna A. Ethington
University of Illinois at Chicago


This paper was prepared for presentation at the annual meeting of the 
American Educational Research Association, New Orleans, 1988. 






GENDER DIFFERENCES IN MATHEMATICS: AN INTERNATIONAL PERSPECTIVE 

Gender-related differences on measures of quantitative performance and
problem-solving abilities consistently appear in national assessments (e.g.,
Fennema and Carpenter, 1981; National Assessment of Educational Progress, 1975, 
1983; Wilson, 1972). Using a variety of performance measures, investigators 
have examined the nature of these differences and factors associated with them 
for subjects varying in age from elementary school to undergraduates in college. 
From these studies, it is generally concluded that no gender differences are
evident at the elementary school level, but that beginning at approximately the
seventh grade, any differences that appear favor males (see Fennema, 1974, 1980,
for a review of this literature).

However, gender differences in mathematics achievement have not been found 
to be consistent across countries. Walberg, Harnisch, & Tsai (1986) found 
differences favoring males after controlling for productivity factors in eight 
of the twelve countries studied and no gender differences in the remaining four. 
In the first IEA study, Husen (1967) found differences in achievement to
generally favor males, but differences within countries were not always
significant. Husen also noted, however, that gender differences were a
within-country phenomenon and that across countries, girls may be superior to
boys. These between-country differences would be attributable to curricular and
instructional differences that mirror cultural values. In contrast to these
predominant findings of superior performance by males, in a study of Hawaiian 
students, Brandon, Newton, & Hammond (1987) often found differences favoring 
females among Japanese-American, Filipino-American, and Hawaiian students but 
not among Caucasian students. It is such findings that lend support to the
suggestion made by Leder (1986) that "a clear recognition of the values, 
expectations and beliefs of the wider society within which learning takes place 
is required for a full appreciation of the currently found sex differences in 
mathematics participation and performance" (p. 6). 

Just as higher performance by males on measures of overall mathematics
achievement has not been consistent across countries, another body of research
notes that the size and direction of gender differences in quantitative
performance vary according to problem type. These studies have been conducted
with subjects of varying ages and educational backgrounds and have resulted in
reasonably consistent conclusions. Several types of problems have been
identified in which gender differences appear. Males have been found to excel
in problems dealing with measurement and proportionality (Bart, Baxter, & Frey,
1980; Fennema, 1980; Fennema & Carpenter, 1981; Pattison & Grieve, 1984; Wood,
1976) and in problems with a spatial component (Fennema, 1980; Fennema &
Carpenter, 1981; Pattison & Grieve, 1984), whereas females were found to perform
better on items testing computational skills (Fennema, 1974; Jarvis, 1964;
Meece, Parsons, Kaczala, Goff, & Futterman, 1982) and those involving more
abstract deductive reasoning such as the algebra of sets (Wood, 1976) and
problems involving the construction and analysis of symbolic relationships
(Pattison & Grieve, 1984). Additionally, Maier & Casselman (1971) concluded that
while males consistently score higher overall than females on problem-solving
tests, women's best performance was on idea-getting problems rather than on
problems that required making essential distinctions.

Wood (1976) noted that in some of the schools involved in his study, gender
differences within the schools were greater than those for the whole sample, 
while within others the differences vanished or reversed. He subsequently 
suggested that a fundamental factor in the presence or absence of gender 
differences may be the style of instruction. If differences occur within
countries, it is also possible that the performance of males and females on
specific item types could vary across countries just as performance on overall
achievement measures does. Should this be the case, it would suggest that
curricula, pedagogy, and culture interact with gender in affecting quantitative
performance.

The recently completed Second International Mathematics Study (SIMS) 
provides an opportunity to determine if the cultural differences suggested by 
Leder (1986) are manifested in item type differences as well as in overall
performance. Twenty-four countries around the world participated in this 
comprehensive study of school mathematics in which students of approximately the 
same age and grade level were administered the same core items. Individual item 
performance was recorded for each student, and thus, performance on item type as 
well as overall performance can be compared across country as well as gender. 
The purpose of this study was to investigate whether the patterns of gender 
differences on specific problem types evidenced in previous studies were 
consistent across the countries involved in the SIMS longitudinal study. 

DATA 

Data for this study were drawn from the Population A longitudinal data file 
of the Second International Mathematics Study (SIMS), a comprehensive survey of 
the teaching and learning of mathematics in countries around the world. The
longitudinal data file contained data for eight of the twenty-four countries 
involved in the study. They were Japan, Flemish Belgium, British Columbia, 
France, Ontario, United States, New Zealand, and Thailand. Population A was 
defined to be the eighth grade in the United States and other countries and 
seventh grade in Japan. Population A represents a grade level where 
approximately all of the students in most of the participating countries are 
still studying mathematics in a common program (Crosswhite, Dossey, Swafford, 
McKnight, Cooney, Downs, Grouws, & Weinzweig, 1986). 

Students were tested at the beginning and end of the 1981-82 academic 
school year using internationally developed mathematics achievement tests. The 
items on the achievement tests used in the SIMS study were developed such that 
the mathematics curriculum of each participating country was adequately sampled. 
All of the students in Population A took a core test and one of four rotated 
forms constructed using item sampling procedures. Post-test performance on the 
core items common to each country was analyzed in the present study. Using the
longitudinal form construction strata, the items were clustered according to
content areas. The content areas were fractions, ratio/proportion/percent,
algebra, geometry, and measurement. The percent of items correct within each
cluster was then computed for each student and averaged within each country by
gender. Table 1 presents these averages.
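
The aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `records` sample, and the function names are hypothetical and are not taken from the SIMS data files.

```python
from collections import defaultdict

# Hypothetical student records (not SIMS data):
# (country, gender, content cluster, item scores with 1 = correct, 0 = incorrect).
records = [
    ("Japan", "F", "algebra", [1, 1, 0, 1]),
    ("Japan", "F", "algebra", [1, 0, 0, 1]),
    ("Japan", "M", "algebra", [1, 1, 1, 1]),
]

def percent_correct(scores):
    """Percent of items answered correctly by one student within one cluster."""
    return 100.0 * sum(scores) / len(scores)

def cell_means(rows):
    """Average the student-level percents within each cluster/country/gender cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for country, gender, cluster, scores in rows:
        key = (cluster, country, gender)
        sums[key] += percent_correct(scores)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

means = cell_means(records)
```

Each entry of `means` corresponds to one cell of a table laid out like Table 1, with one value per content area, country, and gender.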



Insert Table 1 About Here 

METHODOLOGY

Analyses were conducted using the exploratory data analysis method of 
median polishing (see: Tukey, 1977; Velleman and Hoaglin, 1981). The 
exploratory approach of this method does not test hypotheses but involves a 
decomposition of the data, producing patterns of effects that are not 
necessarily apparent in the summary data.¹ The median polish decomposes the
data into a common effect, an effect associated with gender, an effect 
associated with country, and a residual. The common effect is interpreted as a 
typical score for the entire sample of students. The gender and country effects 
then indicate performance relative to this typical score that would be expected 
for a student of specified gender and nationality. 

The model used in this study is similar to the additive model of analysis 
of variance but uses medians rather than means to describe common effects, row 
effects, and column effects. For the factors involved in this study the model 
is 

X_{ij} = M + G_i + C_j + e_{ij}

where X_{ij} is the mean proportion for gender i in country j; M is the common
effect (median across countries); G_i is the effect of gender i; C_j is the
effect of country j; and e_{ij} is a residual. The residual indicates how well the
model describes the data. Extraordinary values in the table will, after fitting 
the model by median polish, leave residuals that stand out from other residuals. 
Differences in country effects would reflect not only differences in student
performance across countries but also curricular, pedagogical, and cultural
differences.
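
A minimal sketch of a median polish for a gender-by-country table follows. It is not the program used in the study. The sweep order (rows first), the fixed number of sweeps, and all names are assumptions made for illustration. The input is the pair of Measurement rows from Table 1.

```python
from statistics import median

def median_polish(table, sweeps=10):
    """Decompose table[i][j] into common + row[i] + col[j] + resid[i][j]
    by alternately sweeping out row medians and column medians."""
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(r) for r in table]
    common = 0.0
    row = [0.0] * n_rows
    col = [0.0] * n_cols
    for _ in range(sweeps):
        # Row sweep: move each row's median residual into that row's effect.
        for i in range(n_rows):
            m = median(resid[i])
            row[i] += m
            resid[i] = [x - m for x in resid[i]]
        # Re-centre the column effects, moving their median into the common value.
        m = median(col)
        common += m
        col = [c - m for c in col]
        # Column sweep: move each column's median residual into that column's effect.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        # Re-centre the row effects, moving their median into the common value.
        m = median(row)
        common += m
        row = [r - m for r in row]
    return common, row, col, resid

# Measurement rows of Table 1, in the order
# U.S., Belgium, B.C., Thailand, France, N.Z., Ontario, Japan.
measurement = [
    [51.05, 66.63, 61.55, 52.98, 64.86, 51.65, 59.23, 74.42],  # female
    [49.23, 63.26, 59.25, 49.51, 66.45, 51.84, 58.43, 75.52],  # male
]
common, gender_eff, country_eff, resid = median_polish(measurement)
```

On these two rows the sketch settles at a common value of about 59.6, in line with the 59.61 reported for measurement. Other sweep orders or stopping rules can give slightly different decompositions, so exact agreement with Table 2 should not be assumed.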

RESULTS

The results of the median polishes of the mean percentages of correct 
responses over all items and within each content area are given in Table 2. The 
first panel shows the results over all items. The common effect of 55.64 can be 
interpreted as the typical percent of items answered correctly for this sample 
of students, and the gender and country effects indicate increments in 
performance relative to this typical score as a result of membership in 
particular categories. The cell residual indicates that portion of the mean 
score not accounted for by the common value, gender, and country. For example, 
the mean for females in the United States shown in Table 1 can be expressed as: 

48.58 = 55.64 + (-.16) + (-7.28) + .39. 

The mean percent correct for females in the United States is composed of the
common value of 55.64, from which .16 is subtracted for being female and 7.28 is
subtracted for being from the United States, plus a residual of .39 points. The
small residual indicates that the model fit for this group is good. Aside from the
common effect, the predominant effects are those associated with countries. In 
fact, the gender effects are smaller than any of the residual values. 
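
As a quick arithmetic check on the worked example for females in the United States, the reported effects can be summed directly. The variable names below are illustrative only.

```python
# Values reported in the text for females in the United States (all items).
common = 55.64        # common effect
gender_f = -0.16      # effect for females
country_us = -7.28    # effect for the United States
observed = 48.58      # mean percent correct from Table 1

fitted = common + gender_f + country_us   # value implied by the additive model
residual = observed - fitted              # what the model leaves unexplained
```

Here `fitted` is 48.20 and `residual` is 0.38, which agrees with the reported residual of .39 to within 0.01. The small gap presumably reflects rounding of the published effects.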



Insert Table 2 About Here 



The total effects resulting from the median polishes of the five content 
areas ranged from 51.04 for algebra to 59.61 for measurement, indicating that a 
typical score for a student, without regard to gender or nationality, would be in
the 50% to 60% range on any of the types of items. Again, after the common
effect was removed the largest effects were associated with the countries, but 
the order of magnitude differed across subject matter areas. These effects 
reflect not only differences in the curricula and opportunity to learn across 
the eight countries, but differences in pedagogies and culture as well. For 
example, Belgium and France focus on fractions, geometry, and algebra at the
Population A level (McKnight, Crosswhite, Dossey, Kifer, Swafford, Travers, &
Cooney, 1987) and thus have similar curricula. Both countries have large
positive effects in the median polish of the algebra items. A substantial
positive effect is also seen for Belgium on fractions, but the effect for France,
while positive, is small, and on geometry, the effect for Belgium is negative
while France's remains small and positive. Similarly, the United States and
British Columbia report approximately the same opportunities to learn across the 
content areas (McKnight et al., 1987), but the effects for British Columbia are 
consistently positive and those of the United States negative. These results 
suggest that opportunity to learn is not the predominant contributor to student 
achievement. 

Of primary interest in this study, however, was the effect of gender. The 
gender effects were very small in each of the median polishes. The largest 
gender effects were seen on fractions, but these indicated that on the basis of
gender alone, one would expect a difference of only about 1.5 percentage points
favoring females. The largest effect favoring males was found with the geometry
items, again approximately 1.5 percentage points. With few exceptions, the 
gender effects were smaller than the residuals. 

Examination of the patterns of residuals suggested that the absence of
substantial gender effects was the result of interactive effects. That is, the
presence of gender differences was a function not only of content but also of
country. For example, the residuals for females in France were consistently
negative, indicating slightly lower scores overall and on each of the separate
content areas than would be expected after removing the effects of gender and
country, whereas the residuals for females in Thailand were consistently positive,
indicating higher scores than would be expected given the model. Residuals for 
both males and females in the United States were all close to zero. In the 
absence of a substantial gender effect, this indicates that the average 
performance for both males and females in the United States is approximately the 
same and is a function of only the common effect and country effect. 

While none of the residuals were extremely large in an absolute sense and
the majority were close to zero, those greater than |2| tend to stand out.
These residuals indicate greater differences in performance between males and 
females. Only in France and New Zealand are the larger residuals seen to favor 
males. In the area of fractions, females in New Zealand scored lower than would 
be anticipated, yet the females in Belgium, British Columbia, and France scored 
higher in this area. Females in France scored lower in algebra and geometry, 
but females in Belgium were higher in algebra and those in Thailand were higher 
in both algebra and geometry. In fact, females in Thailand had residuals 
greater than 2 in all areas but measurement, and measurement was the only 
content area with no residuals greater than |2|. 

DISCUSSION



The Second International Mathematics Study presents a unique opportunity to 
investigate gender differences in measures of mathematics achievement. Students 
of approximately the same ages from countries throughout the world were tested 
on the same items, allowing comparisons across countries on identical items. 
From the analysis of these data it appears that the cultural differences
suggested by Leder (1986) are apparent in item type performance as well as on 
composite measures of quantitative performance. 

There were no substantial gender effects in any of the content areas, and 
the slight effects shown favored girls more often than boys. These findings 
differ from those cited previously wherein consistencies were found in gender 
differences across problem type. For example, previous studies found males to 
perform better than females on problems dealing with proportionality, yet these 
results show females in Thailand scoring almost five percentage points higher 
than males on the ratio/proportion/percent items. Furthermore, within no 
content area were males found to persistently outperform females across 
countries or vice versa. The absence of these types of effects supports the
suppositions of Wood (1976) and Leder (1986) that pedagogical and cultural
factors may contribute to the presence or absence of gender differences.

Studies examining cross-cultural differences in mathematics performance 
have identified affective factors that may also contribute to the presence or 
absence of gender differences within countries. In explaining American
kindergarten, first-, and fifth-grade students' low performance relative to
Japanese and Chinese students, Stevenson, Lee, & Stigler (1986) cite large
differences in the students' lives in school, attitudes and beliefs of their
mothers, and student and parental involvement in school work. Also, in a
national report on the Second International Mathematics Study (McKnight, et al., 
1987) the low achievement evidenced by students in the United States relative to 
other countries raised concerns about the "nature and quality of the pedagogy 
demonstrated in the U.S. mathematics classrooms" as well as "the way the content 
goals are distributed in school mathematics" (p. 9). Future research should 
investigate how curricula, pedagogical techniques, and cultural factors
interact with gender in affecting quantitative performance.

Footnote



A two-factor (gender by country) ANOVA may appear to be called for to 
address the research question posed in this study. However, the extremely large 
sample size (N > 40,000) resulting from the aggregation of the data by countries 
produces significant statistics for each effect tested, regardless of how small 
the effect. While the linear model used in the median polish does not contain a 
specific interaction component, examination of the residuals can indicate 
possible interactive effects. If residuals are consistently positive or 
negative for one gender, the effects of country are the same for each gender; if 
not, interactive effects are present. Within each country where residuals 
greater than |2| were observed, independent t-tests were calculated and in each
instance were significant with p < .01. 
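
The point about sample size can be illustrated with a rough equal-variance, equal-group-size t statistic. This is a sketch only: the half-point difference, the standard deviation of 20 points, and the group sizes are hypothetical values chosen for the arithmetic and are not taken from the SIMS data.

```python
import math

def two_sample_t(diff, sd, n_per_group):
    """t statistic for a mean difference between two equal-sized groups
    sharing a common standard deviation."""
    return diff / (sd * math.sqrt(2.0 / n_per_group))

# Hypothetical: the same half-point difference at two very different sample sizes.
t_small = two_sample_t(0.5, 20.0, 200)      # 200 students per group
t_large = two_sample_t(0.5, 20.0, 20000)    # 20,000 students per group
```

With 200 students per group the statistic is 0.25. With 20,000 per group the same half-point difference gives 2.5, which would pass a conventional .05 threshold even though the effect is negligible in practical terms.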

Table 1. Mean Proportion of Mathematics Items Correct by Gender and Country

Country 

U.S. Belgium B.C. Thailand France N.Z. Ontario Japan 



Gender 
Female 
Male 



Female 
Male 



Female 
Male 



Female 
Male 



Female 
Male 



48.58
48.13 



49.20 
48.92 



45.16 
44.56 



45.74 
46.56 



62.24 
59.00 



52.65 68.19 
52.18 61.28 



48.75 
49.76 



Total 

60.25 48.13 55.30 
58.32 43.55 58.79 

Fractions 



45.20 
47.08 



67.85 46.44 56.38 36.14 
62.18 38.64 58.74 38.74 



Ratio/proportions/percent 



59.42 62.12 
59.03 60.68 



66.89 57.07 
61.58 56.82 



58.16 40.40 
53.54 43.75

Algebra 

37.14 59.06 
33.02 64.14

Geometry 



45.67 
48.99 



41.35 
42.45 



53.57 
53.61 



48.39 51.14 50.92 
45.45 56.51 54.18 

Measurement 



52.55 
53.64



50.16 
51.52 



44.06 
46.22 



52.41 
55.40 



68.09 
69.33 



82.86 
80.39 



57.90 37.24 
57.85 40.53 



69.47 
72.37 



76.13 
77.96 



Female 51.05 66.63 61.55 52.98 64.86 51.65 59.23 74.42 
Male 49.23 63.26 59.25 49.51 66.45 51.84 58.43 75.52 